
Ready to build the future with AI?
At Genpact, we don't just keep up with technology - we set the pace. AI and digital innovation are redefining industries, and we're leading the charge. Genpact's AI Gigafactory, our industry-first accelerator, is an example of how we're scaling advanced technology solutions to help global enterprises work smarter, grow faster, and transform at scale. From large-scale models to agentic AI, our breakthrough solutions tackle companies' most complex challenges.
If you thrive in a fast-moving, innovation-driven environment, love building and deploying cutting-edge AI solutions, and want to push the boundaries of what's possible, this is your moment.
Genpact (NYSE: G) is an advanced technology services and solutions company that delivers lasting value for leading enterprises globally. Through our deep business knowledge, operational excellence, and cutting-edge solutions, we help companies across industries get ahead and stay ahead. Powered by curiosity, courage, and innovation, our teams implement data, technology, and AI to create tomorrow, today. Get to know us at genpact.com and on LinkedIn, X, YouTube, and Facebook.
Inviting applications for the role of Lead Consultant - Observability Engineer - SRE & Datadog Specialist
We are seeking a highly skilled Observability Engineer with deep expertise in Site Reliability Engineering (SRE) principles and Datadog observability tooling.
This role will be instrumental in driving end-to-end observability maturity, optimizing system performance, and embedding SRE practices across global platforms.
The role is also expected to actively research, prototype, and roll out new observability capabilities, automation patterns, and emerging practices (including agentic and AI-assisted approaches) to continuously evolve the observability ecosystem.
The ideal candidate is methodical, detail-oriented, and demonstrates strong ownership in building resilient observability solutions that deliver reliability insights, enable proactive incident management, and ensure operational excellence.
Responsibilities
1. Observability Platform Engineering
• Design, implement, and maintain Datadog-based observability solutions across infrastructure, platforms, and applications.
• Develop and optimize dashboards, monitors, and alerts to support proactive detection and triage of performance and reliability issues.
• Integrate custom telemetry pipelines (metrics, logs, traces, events) aligned with OpenTelemetry and platform architecture standards.
• Manage instrumentation strategies to ensure accurate and consistent coverage across services.
2. Site Reliability Engineering (SRE) Practices
• Apply SRE principles to improve service reliability, availability, and performance.
• Define and track SLIs, SLOs, and SLAs for critical systems, and build feedback loops to continuously enhance service health.
• Automate manual operational processes using Python, Terraform, or CI/CD tooling.
• Collaborate with development and platform teams to identify resilience patterns and embed observability by design.
3. Datadog Expertise & Ecosystem Enablement
• Serve as the subject matter expert (SME) for Datadog - advising on advanced configurations, integrations, and performance optimization.
• Enable distributed tracing, APM, RUM, and synthetics capabilities to support end-to-end visibility.
• Implement and maintain Datadog Terraform configurations, templates, and governance models for enterprise consistency.
• Conduct performance tuning and cost optimization for Datadog usage across global environments.
4. Incident & Problem Management
• Partner with the Operations and Platform teams to analyze incident patterns and provide root cause insights through observability data.
• Lead post-incident reviews and recommend observability-driven improvements to prevent recurrence.
• Build automation and correlation mechanisms for real-time alert enrichment and contextual diagnostics.
5. Continuous Improvement, R&D, and Automation
• Proactively identify gaps, inefficiencies, and manual workflows within the observability ecosystem and design automation-first solutions.
• Research, prototype, and evaluate new observability patterns, tools, and techniques, including AI- and agent-based approaches, before scaling them into production.
• Build reusable frameworks, templates, and toolkits to reduce toil and enable self-service adoption across engineering teams.
• Continuously improve observability signal quality, alert precision, and operational efficiency through experimentation and iteration.
• Translate learnings from incidents, postmortems, and usage data into systemic improvements rather than one-off fixes.
Qualifications we seek in you!
Minimum Qualifications
• Bachelor's degree in Computer Science, Information Systems, or a related field.
• Experience in observability engineering or SRE roles within large-scale distributed systems.
Preferred Qualifications/Skills
• Deep, hands-on expertise with Datadog, including APM, Logs, Metrics, RUM, and Synthetics.
• Strong proficiency in:
  ◦ Infrastructure as Code (IaC): Terraform
  ◦ Automation: Python, Bash, or similar scripting languages
  ◦ CI/CD pipelines: Jenkins, GitLab, or GitHub Actions
• Experience supporting multi-cloud environments (AWS, GCP, Azure).
• Familiarity with container orchestration (Kubernetes, ECS) and service mesh observability.
• Understanding of data visualization and analytics for operational reporting.
• Exposure to AI-driven observability enhancements or integration with LLM-based insights (a plus).
• Certification in Datadog, AWS, or GCP is advantageous.
Why join Genpact
• Lead AI-first transformation - Build and scale AI solutions that redefine industries
• Make an impact - Drive change for global enterprises and solve business challenges that matter
• Accelerate your career - Gain hands-on experience, world-class training, mentorship, and AI certifications to advance your skills
• Grow with the best - Learn from top engineers, data scientists, and AI experts in a dynamic, fast-moving workplace
• Committed to ethical AI - Work in an environment where governance, transparency, and security are at the core of everything we build
• Thrive in a values-driven culture - Our courage, curiosity, and incisiveness - built on a foundation of integrity and inclusion - allow your ideas to fuel progress
Come join the 140,000+ coders, tech shapers, and growth makers at Genpact and take your career in the only direction that matters: Up.
Let's build tomorrow together.
Genpact is an Equal Opportunity Employer and considers applicants for all positions without regard to race, color, religion or belief, sex, age, national origin, citizenship status, marital status, military/veteran status, genetic information, sexual orientation, gender identity, physical or mental disability or any other characteristic protected by applicable laws. Genpact is committed to creating a dynamic work environment that values respect and integrity, customer focus, and innovation.
Please note that Genpact does not charge fees to process job applications, and applicants are not required to pay to participate in our hiring process in any other way. Examples of such scams include purchasing a 'starter kit,' paying to apply, or purchasing equipment or training.
Genpact (NYSE: G) is a global professional services and solutions firm delivering outcomes that shape the future. Our 125,000+ people across 30+ countries are driven by our innate curiosity, entrepreneurial agility, and desire to create lasting value for clients. Powered by our purpose - the relentless pursuit of a world that works better for people - we serve and transform leading enterprises, including the Fortune Global 500, with our deep business and industry knowledge, digital operations services, and expertise in data, technology, and AI.
Job ID: 143488105